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We present a genetic algorithm for the atomistic design and global optimisation of substitutionally 
disordered bulk materials and surfaces. Premature convergence which hamper conventional genetic 
algorithms due to problems with synchronisation is avoided using a symmetry adapted crossover. 
The algorithm outperforms previously reported Monte Carlo and genetic algorithm simulations for 
finding low energy minima of two simple alloy models without the need for any redesign. 



INTRODUCTION 

Crystal structure prediction and the design of func- 
tional materials from first principles continue topresent 
a major challenge to the theoretician 0, [2, H, Hf. It is 
unfortunate that the presence of imperfections and disor- 
der make these problems even harder. Nevertheless, tak- 
ing explicitly into account disorder (both impurities and 
defect concentrations far from the dilute limit), rather 
than resorting to various mean-field approaches and sim- 
ple crystallographic models, is of key importance in order 
to understand material properties at their working tem- 
perature and pressure [!, 0] . 

Many materials with defect concentrations far from 
the dilute limit have a substitutional disordered struc- 
ture. Prime examples involve metallic and semiconduc- 
tor alloys as well as many ceramics, all of which are 
of tremendous technological importance within numer- 
ous fields as electric-, magnetic,- and optical components 
or devices. However, substitutionally disordered materi- 
als are in general not well understood from an atomistic 
point of view, and describing the preferential arrange- 
ments of the atoms and defects is essential in order to 
help searching for materials with target properties Q. 
Identifying the local structural variations within disor- 
dered compounds is not only essential for designing new 
materials, but the disorder provided by the solid solu- 
tion of atoms in minerals is also essential for understand- 
ing the dynamics in the earth's interior and of particular 
importance to assess computationally since these com- 
pounds often only exist at conditions (high pressure and 
temperature) which are difficult to reproduce experimen- 
tally (|. 

Taking explicitly into account the variation in the local 
structure is not only important for understanding disor- 
dered bulk materials, but also essential for understand- 
ing many surface phenomena. For example, simulations 
of the deposition of small organic molecules on various 



metal surfaces [§| , and locating the surface-configurations 
which are likely to influence the powder morphology of, 
e.g., UO2 [10] are not trivial due to the multitude of dif- 
ferent possible atomic arrangements. Also, the ability 



to position atoms at will using a STM-tip [llj or the 
opportunities provided by various atomic vapour deposi- 
tion techniques open the way to efficient material design 
in search for configurations/superstructures with certain 
target physical properties which may be difficult to guess 
by chemical intuition. 

In a first attempt to attack substitutionally disordered 
materials by simulation in search for, e.g., low energy 
configurations and superstructures, one could carry out 
an exhaustive and unbiased enumeration of all likely can- 
didates. Alas, such a procedure allows only the smallest 
simulation cells to be investigated. Consider, for exam- 
ple, the archetypal binary fee alloy of gold and copper, 
AuCu. Although the smallest simulation cells can be 
investigated by means of brute force enumerations (for 
instance the small 4 atom unit-cell has only 12 distinct 
arrangements), the number of arrangements explodes 
rapidly with the size of the simulation box. Within a 
2x2x2 simulation cell there are 6 x 10 8 ways of arrang- 
ing the 16 Au and 16 Cu, whereas a 4 x 4 x 4 (256 atom) 
supercell contains 6 x 10 75 configurations. Sadly, often 
much larger cells (> 1000 atoms) are needed for the ac- 
curate calculation of many alloy and mineral properties. 
Although symmetry can be used to reduce the search 
space [12| , the combinatorial explosion of the number of 
potential energy minima cannot be overcome by conven- 
tional means, and other strategies are required. 

Finding low energy minima (including the ground 
state) or configurations with "key properties" for alloys 
are in general nondeterministic polynomial-time hard in 
which heuristics such as for instance genetic algorithms 
(GA) have the potential to be particular suitable 
Indeed, Smith has demonstrated in probably the 
first application of GA for the study of disordered ma- 
terials, that GA are highly efficient in finding low en- 
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ergy configurations for several simple alloy-models. It 
was shown [l4[ that GA typically outperform the popu- 
lar Metropolis Monte Carlo algorithm (MC) [lit . 

Several contributions have demonstrated how GA can 
be applied successfully for the study of crystalline ma- 
terials with substitutional disordered structure. Kim et 
al [3] as well as Dudiy and Zunger [ItJ applied GA for 
the study of semiconductor alloys via the use of an inverse 
band structure approach in which alloy-configuration 
with particularly large band-gaps were searched for. Ap- 
plications of GA in search for low energy configurations 
of simple binary oxide solid solutions have also been suc- 
cessful [HI. 

However, despite these promising attempts, the dy- 
namics of genetic algorithms for the study of strongly 
disordered materials is not well understood and prob- 
lems of choosing a suitable representations as well as the 
role of the different GA-operators require further inves- 
tigations. In particular when bearing in mind that re- 
cent global energy minimisations of very simple ID Ising 
models [lij (which are very similar to that of finding low 
energies of alloys) have shown that conventional GA are 
hampered by slow convergence at the late stage of the 
evolution due to the presence of inherent symmetry. We 
shall therefore take the opportunity to investigate very 
simple alloy models via GA, in an attempt to provide a 
solution to problems which appear to hamper the use of 
conventional GA for the study of substitutional disorder 
and related problems in which the underyling symmetry 
may explain the slow convergence of conventional GA. 

We follow closely Smith [14j . who carried out detailed 
tests on several different simple alloy models and com- 
pared the performance of conventional GA with that of 
MC. First, we give a general introduction to conventional 
GA within the field of alloy modelling. The basic in- 
gredients are briefly discussed. Then, having identified 
and analysed the synchronisation problem which appear 
to hamper conventional GA, we briefly describe an effi- 
cient and general GA for finding alloy-configurations with 
target physical properties, e.g. low energy minima, us- 
ing a simple ID Ising model as an example. We discuss 
the algorithm for two simple 2D alloy models with non- 
trivial representations (genetic encoding), and compare 
with previously reported GA and MC simulations. The 
first model is a simple 2D periodic binary "Ising alloy" 
with nearest neighbour interactions. This model is an 
important benchmark for testing and analysing the per- 
formance of GA since exact analytical solutions exists 
and since the thermodynamic behaviour is well under- 
stood. The second test is also a 2D alloy model, where 
the atoms now interact via a Morse potential with param- 
eters chosen such as to mimic the BiCu alloy. This test 
is particular challenging for conventional GA as demon- 
strated by Smith [14| who found that GA actually per- 
form worse than MC in locating low energy minima. We 
finally compare results from simulations using Ising mod- 



els and Morse potentials such as to analyse the influence 
of the shape of the potential (the distribution of fitness) 
on the performance of GA. 



THEORY 
The alloy models 

Ising alloy 

The first case is a simple periodic 2D Ising model where 
the atoms are distribution on a square lattice. The po- 
tential, originally introduced for the study of ferromag- 
netism in 1925 by Ising [20| . is given as (ignoring external 
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The "spins", <7j, are either +1 (spin up) or —1 (spin 
down) and J is the (exchange) coupling constant between 
the spins. The summation is carried over the nearest 
neighbours only. For the study of binary alloys using 
a potential of the form of equation 10.11 Oi now repre- 
sent an atom located at site i and J is associated with 
the bond energy. We choose J = — 1, such that bonds 
between unlike atoms contribute —1 to the total energy 
and bonds between like pairs contribute +1 to the to- 
tal energy. The potential is thus designed to promote 
a high degree of mixing between the different species. 
For the study of alloys, the summation in equation [03] is 
constrained to maintain the overall composition (or mag- 
netisation). That is 



m 



Oi = constant, 



(0.2) 



throughout the evolution. 



Morse alloy 

Our second test is also a periodic 2D alloy model with 
atoms distributed on a square lattice with lattice con- 
stant 2.56 A. The interactions are modelled using a Morse 
potential of the form 

Vij (nj ) = Dij [exp(-2a. iJ ))-2exp(-a„ (nj -/%))] ■ 

(0.3) 

Here, ry is the interatomic distance between atoms i and 
j. Dij,aij and fyj are parameters which are kept fixed 
for a given interaction (see table EJ. The parameters were 
chosen such as to model the binary BiCu bulk alloy [2lj 
and were employed by Smith [T3 | for benchmarking his 
genetic algorithm. 

In this work we carry out energy calculations using 
equation 2.3 including either only the nearest neighbour 
interactions, or up to the fourth nearest neighbours. 
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TABLE I: Parameters used with the Morse potential. 



Pot D/eV a/A- 1 g/A 



Cu-Cu 0.357 
Cu-Bi 0.154 
Bi-Bi 0.412 



1.386 
1.216 
1.136 



2.82 
3.18 
3.54 



Conventional GA for the study of alloys 

Genetic al gori thms, originally introduced by Holland 
in the 1960s [l3j, are searches for finding good solutions 
to hard problems. In contrast to the popular Metropolis 
Monte Carlo algorithm where changes to a single solution 
are carried out, a population of solutions is evolved. Bet- 
ter solutions are found by allowing the "fitter" members 
of the populations to recombine and produce offspring. 
Random, usually small, changes (mutations) can be al- 
lowed for, such as to introduce "fresh blood" in the pop- 
ulation and hence avoid problems with premature con- 
vergence. The advantage of GA over other optimisation 
algorithms is that large swaps can be carried out in the 
configuration space, which is particular useful within the 
field of global optimisations of ,e.g., nano-clusters due 
to the roughness of the fitness landscape. Within this 
field, there exist reasonable consensus re gard ing the ba- 
sic ingredients constituting efficient GA [22]. However, 
these are slightly different compared to those used for 
the study of periodic alloys and related problems such as 
ferromagnetism. We will therefore discuss the GA oper- 
ations for modelling alloys with emphasis on the roles of 
crossover, mutation, encoding (choice of representation), 
population sizing, fitness etc., and compare in some de- 
tails the present implementation with those of Smith [bj ] 
and Kim et al fill ]. 

Encoding: For the purpose of genetic encoding it is 
useful to distinguish between the "real-space" solution 
(phenotype) and the "encoded" solution (genotype). A 
convenient choice of representation for modelling binary 
alloys is a binary string, x = xiX2---Xk---x^ , where Xk 
is either (if an atom of type A occupy lattice position 
k) or 1 (if an atom of type B occupy lattice position fe), 
and N is the number of atoms in the simulation box. 
Intuitively, one expect GA to be sensitive on how the 



real structure is mapped to the form of x. Smith [14 1 
analysed the sensitivity of the performance of GA in the 
choice of different representations on simple binary al- 
loys, by comparing runs where the atoms were mapped 
in major column form with those of mapping the lattice 
entirely at random. It was thought that GA may slow 
down if the positions of the atoms in the string does not 
reflect the actual spatial arrangement of the atoms. Al- 
though slower, GA were found to perform surprisingly 
well even if a "random binary representation" is used. 



Indeed, Kim [l6[ argue that a non-biased GA should not 
include any information of the spatial arrangements and 
successfully applied a "random binary representation" for 
the study of bulk alloys. By contrast, in ref [l|| a bi- 
nary representation that strongly reflects the spatial ar- 
rangements of the crystal structure was successfully used 
for finding low energy configuration of a binary oxide 
solid solution (the CaO:MgO system). This choice of 
representation seems reasonable due to the presence of 
strong Coulomb forces and the localised nature of the 
ionic bond. In this context, it is worth mentioning that 
for the optimisation of nano-clusters it is widely accepted 
that a real-space representation (which is isomorphic to 
the phenotyp e) p erforms better than a binary representa- 
tion, see e.g. [23j, and work is in progress to address these 
issues in further detail for the optimisation of substitu- 
tionally disordered materials. However, for the optimisa- 
tion of 2D alloy models we have chosen to use the same 
representation as Smith [14] to allow for direct compar- 
ison. That is, the 2D lattice is mapped to a ID binary 
string in major column form, which clearly reflect the 
spatial arrangements of the atoms although more "com- 
pact" binary representations exists (lH ]. 

Population- size: In GA, the initial population is con- 
structed by selecting configurations at random. A suf- 
ficiently large initial population is in general crucial in 
order to avoid premature convergence. However, if the 
(initial) population is too large, the efficiency of the GA 
may slow down. On the other hand, GA are often quite 
tolerant to changes in the population-size. We will dis- 
cuss the influence of population-size on the performance 
of GA below. 

Fitness: The members of the population are assigned 
a "fitness" which measures the quality of the trial so- 
lution. Parents with high fitness are preferentially se- 
lected for, hereby the popular phrase "survival of the 
fittest" . The fitness in our work is the total energy (with 
opposite sign), taken either from the simple Ising model 
fequation lO.ip or from the Morse potential fequation l0.3p . 
However, as discussed above any physical property can 
be used. 

Selection: There are many different schemes for the 
selection of parents. Popular schemes are the roulette 
wheel-, the Boltzmann- and tournament selections. In 
this work we use a binary tournament selection in which 
two pairs are randomly chosen, and where the "fittest" 
member in each pair is allowed to mate. The binary tour- 
nament is convenient, since it does not require any choice 
of parameter that controls the selection pressure other 
than the population-size. It is worth mentioning that we 
have carried out checks with different schemes and found 
that GA are fairly insensitive to the choice of selection 
method but sensitive to the amount of selection [Til ]. 

Crossover: After a pair of parents has been selected, 
the parents are allowed to mate. Several types of mat- 
ing operations (crossover) exist, which are either applied 
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on the genotype or directly on the phenotype (the real 
solution). For the 2D alloy models, we use a standard 
binary representation where the parents are encoded in 
binary form. We use a two-point (TP) crossover, Otp : 
B®B -> E®1 (Bis the configurational space) where 
two different integers k < I € [1,N] are chosen at ran- 
dom, and parents, x = x\X2---XkXk+i---XiXi+i...XN and 
y = yiyi—ykVk+i—yiyi+i—yN are cut between the points 
Xk and Xk+i, and between x\ and x\+\ and spliced as fol- 
lows 



T p(x,y) 



xiX2—xkyk+i—yi%i+i—XN = v 
y-LX2...yk%k+i—xiyi+i—yN = w. 



(0.4) 



One of the children, e.g. v, is chosen at random, possi- 
bly mutated (see below), and added to the population by 
replacing the least fitted member. Note that the points 
where the parents are cut are chosen at random, but 
constrained such as to maintain the overall fixed compo- 
sition. This is a severe constraint, and provides a major 
challenge in designing efficient GA for alloy modelling 
using single and two-point crossover operators. Hence, a 
uniformed crossover, where a random set of bits are ex- 
changed between the parents, has most frequently been 
used for the study of periodic alloys (see e.g. ref [14|). 

Mutation: Random, usually small, mutations, M, 
may be applied typically with a small probability (< 
0.01). In a few cases, we use a permutation operator 
which exchange a pair of different atoms at random po- 
sitions fc and I 



M(z) = M(ziz 2 ...z k ...zi...z N ) = ziz 2 ...zi...z k ...z N . 
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FIG. 1: The populations at three stages using a conventional 
genetic algorithm applied to a ID 100 spin Ising model with a 
fixed 50:50 composition of species A and B. Black and white 
squares are used to represent the different species. Top figure 
shows the initial population, the middle figure is the popu- 
lation after 50 generations (80 fitness evaluations), and the 
bottom figure shows the population after 100 steps (130 fit- 
ness evaluations). 



Conventional GA and synchronisation 

Although GA have been highly successful for the study 
of alloys and has the potential to out perf orm the popular 
Metropolis Monte Carlo method [14, 1(| HI, the optimi- 
sation of various Ising models via evolutionary algorithms 
have shown that conventional GA are hampered by slow 
convergence [l^. Simulations carried out in refs 14, T3| 
have shown that the use of highly specialised GA with 
modified mating schemes and rather "brutal" mutations 
(where large blocks of atoms were exchanged or scram- 
bled with high probabilities (> 0.1]) were necessary to 



find global minima. It was argued 14j , that such mating 
operations and mutations are required due to the con- 
straint of fixing the overall chemical composition which 
severely limits the search-space of the genetic algorithm 
and leaves very little flexibility for the crossover to work 
properly. However, van Hoyweghen et al [l|| carried out 
a detailed analysis of the dynamics of GA for finding low 
energy minima of an Ising ferromagnet without any such 
constraints and observed a similar slow convergence. 



Following van Hoyweghen et al fl9| |. we have analysed 
the convergence of a genetic algorithm for ID Ising mod- 
els with periodic boundary conditions. Tests were car- 
ried out with J — 1 and J = -1 with and without the 
constraint of fixing the overall composition (or magneti- 
sation if viewed as a spin-problem) . In all these cases, we 
found, in agreement with ref [19j, that conventional GA 
(without mutations) are unable to find global minima of 
a 100 spin Ising model, even with huge population sizes. 

To visualise the dynamics of a typical conventional ge- 
netic algorithm with emphasis on the role of the crossover 
when solving Ising-like problems, we show in figure [Tj 
snapshots of an evolution using a 100 spin Ising model 
with J = 1 and fixed overall composition as an exam- 
ple. A steady state genetic algorithm, where the child 
replaces the worst fit member in the population, is used. 
The population-size is 30 and no mutations were applied. 

Although the crossover is very efficient in finding low 
lying minima rapidly, the efficiency slows down very early 
in the evolution. After only 50 generations the diversity 
of the population is lost and the crossover has become 
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useless. As can be seen, there are no recombinations 
which enables the building blocks within the population 
to combine to form higher order building blocks of an 
optimium. 

The failure of conventional GA in finding a global min- 
imum by means of recombinations is explained by the loss 
of a driving force due to the high degeneracy of symmet- 
rically equivalent low energy configurations which in gen- 
eral are far away from one another in configuration space. 
A modest sized population does not contain sufficient di- 
versity to "differ" between the building blocks of the low 
energy minima and the global minima because the high 
order building blocks (schemata) of the global minima 
are very similar compared to those of all other low energy 
minima. That is, the schemata of the global minima and 
low energy minima contain all high order building blocks 
such as **1100**, **illl** and **0000***. However, 
since the number of global minima are outnumbered by 
the low energy excited states and since furthermore sym- 
metrically equivalent low energy minima (including the 
global minima) are far away from one another in the con- 
figurational space (in terms of Hamming distance), the 
conventional GA are unable to find global minima, and 
are stuck in a local optimum. 

As can be seen in figure [TJ the lack of synchronisation 
is manifested in the representation as domains of **00** 
and **11** at fixed spin-positions when conventional GA 
are used. The solutions found are low lying minima in- 
deed, but clearly far away from the global minima in 
Hamming distance. Although the performance of con- 
ventional GA for attacking different Ising models can be 
improved using specialised operators [T5J , the strength of 
the crossover, which should play a major role in designing 
efficient GA, is not fully utilised. 



Symmetry adapted GA-operators 

Assume that the phenotype possesses some symmetry 
associated with a group, say £f , of order K, and that 
the representation chosen is isomorphic to the phenotype. 
Applying a random clement G a £ S? on an arbitrary con- 
figuration, x, leaves the fitness, f, invariant 



/((G a (x)) = /(x) 



(0.6) 



without affecting the "schemata" of x. At this point, it 
is worth emphasising, that x now represents the real- 
space configuration rather than the genotype (binary 
string). Replacing, at each generation, the population 
(x, y, ...) with (G a x, Gby, ...), where the symmetry oper- 
ations G a , Gb , ■ • ■ are chosen at random allows all sym- 
metrically equivalent configurations of the members in 
the current population to be accessible with equal prob- 
ability at any step (generation). Thus, problems due to 
synchronisation is avoided, allowing the building-blocks 



of the optima to form also when the evolution is govern 
by genetic drift 13. |24|. 

Turning to our ID periodic Ising model introduced 
above, we show how the symmetry can be incorporated 
in a straightforward manner within a conventional ge- 
netic algorithm scheme. Note that the ID Ising problem, 
due to periodic boundary conditions, does not have the 
geometry of a string, but of a circle 



x = x\x%...Xi...Xjsi with xjv + i = x±, 



(0.7) 



and hence the encoded representation (genotype) can be 
trivially chosen isomorphic to the phenotype. Further- 
more, when the overall composition is fixed, the genotype 
possesses the point-symmetry of an TV-sided polygon and 
transforms according to the irreducible representations of 
the dihedral group, of order 2N . 

Having identified the symmetry of the problem, we now 
introduce a symmetry-adapted crossover (SAC) as fol- 
lows. After applying the crossover (equation 10. 4|) . we act 
with a randomly chosen clement G a e Djv on the child, 



x' = G a (x), 



(0.8) 



and add, instead of x, x' to the population. 

To visualise the dynamics of utilising SAC, we return 
to our 100 spin Ising model showing in figure [2] a typical 
scenario. As can be seen by comparing the populations 
after 300 and 1000 steps, the use of symmetry adapted 
mating operations clearly enables the building blocks of 
good solutions (e.g. ****il00**** and ****0011****) 
to combine to form higher order building blocks of even 
better solutions (e.g. **11110000** and **00001111**). 
Since the operations G a (x) leaves the energy of x invari- 
ant without altering the order in which the atoms occur 
in the representation, large swaps can still be carried out 
in the configurational space when the diversity using con- 
ventional GA would have been lost (see figure QJ. That 
is, the synchronisation problem is solved by the use of a 
phenotype representation and symmetry-adapted opera- 
tors by means of inheriting the symmetry of the problem 
within the genetic algorithm. Sufficient diversity is there- 
fore added, which enables the schemata of good solutions 
to combine to from schemata of even better solutions. 

Now, having presented the use of symmetry-adapted 
operators, we turn our attention to the design of SAC 
for the study of 2D alloy models. Since the atoms are 
distributed on a square lattice, the underlying symmetry 
of the primitive unit-cell is that of the plane-group pAm. 
Symmetrically equivalent children are thus formed by ap- 
plying the combined operations G a T a where G Q £ p4m 
and T a is a translation operation. Using a representation 
which is isomorphic to the phenotype involves the use of 
real-space operations. That is, cut-and-splice-type op- 
erators which are applied directly on the configurations 
rather than via the use of binary encodings. However, in 
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FIG. 3: Evolution of the best energy based on 1000 GA runs 
of a 10 x 10 atom Ising model. See the main text for details. 
The full line is that of using SAC, whereas the dotted and 
dashed lines are GA runs carried out using a conventional 
genetic algorithm with and without mutations respectively. 
The energy of a global minimum is -180. 



FIG. 2: The populations at three stages during the evolution 
using a genetic algorithm in conjunction with SAC applied 
to a ID 100 spin Ising model with a fixed 50:50 composi- 
tion of species A and B. Black and white squares are used to 
represent the different species. Top figure shows the initial 
population, the middle figure is the population after 300 gen- 
erations (330 fitness evaluations) and the bottom figure shows 
the population after 1000 steps (1030 fitness evaluations). 



order to compare with that of Smith [14], we have chosen 
to use a representation where the atoms were mapped in 
major column form to a binary ring. Since this represen- 
tation is not isomorphic to the phenotype, the act with 
a symmetry operations on the children x' may alter the 
order in which the atoms are arranged in the representa- 
tion. However, since this representation strongly reflects 
the positions of the atoms in the phenotype the act of a 
symmetry-operation will only alter very few atoms in the 
encoded solution and the crossover will work properly. 



RESULTS AND DISCUSSION 
Finding low energy minima of a 2D Ising model 

Results from GA calculations using a 100 atom Ising 
alloy with a square unit-cell and equal concentrations of 
species A and B are shown in figure [3) Results from cal- 
culations with and without the use of symmetry adapted 



crossover is reported. In this test J = — 1 which promotes 
a high degree of mixing between the different species. 
1000 independent runs were carried out, and the mean 
value of the best energies at each step is displayed in 
figure [31 A steady state genetic algorithm, where the 
child replaces the worst fit member in the population, is 
used. The population-size is 30. The conventional GA 
runs were carried out both with and without a mutation 
of exchanging a pair of atoms (using equation 10 . 5[) with 
probability 0.05, whereas no mutations were used during 
the SAC runs. 

As can be seen from the figure, the use of symmetry- 
adapted operators clearly outperform the conventional 
GA which fails in all attempts to find the global mini- 
mum, in agreement with previous studies 14, 1^|. Even 
with huge population sizes (> 500) the conventional ge- 
netic algorithm is unable to find a global minimum with- 
out the use of mutation operations. However, as can be 
seen, the convergence when using a simple mutation such 
as exchanging a pair of atoms is slow. By contrast, in all 
1000 SAC optimisations, less than 1500 fitness evalua- 
tions were required. We find that SAC is very robust even 
when very small populations-sizes (20 members) are used, 
finding global minima in all 1000 GA runs, although the 
use of larger populations (> 40 members) slows down the 
performance. By contrast, if the population size is too 
small (< 10 members) GA have the tendencies of beeing 
hampered by premature convergence. Using small popu- 
lations (20 members) is particular advantages when the 
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fitness evaluation is expensive since a coarse tuning of 
the parameters can be carried out at relative low cost. 

A few test calculations using SAC were also carried 
out with mutations. Results from these tests indicate 
that the use of mutations slows down the performance, 
although marginally, which is not surprising since the 
diversity of the population is maintained with SAC. 

It is worth mentioning that Smith 1J] is able to find 
a global minimum using a conventional specialised ge- 
netic algorithm with a modified mating scheme and large 
mutations. However, more than 5000 fitness evaluations 
were typically required, which is about an order of mag- 
nitude slower than that of using SAC, and attempts to 
solve larger problems (e.g. a 2D Ising alloy with 400 
atoms) failed. By contrast, we were able to locate the 
global minimum of a 400 atom Ising alloy in less than 
2000 steps. Attempts to find the global minimum of Ising 
alloys with more than 10 000 atoms were successful! 




conv. GA (E cut incl. up to fourth nearest neighbours) 



SAC (E cut incl. up to fourth nearest neighbours) 
SAC (E cut incl. nearest neighbours only) 



500 1000 1500 2000 2500 3000 
Steps 



Finding low energy minima of a 2D Morse alloy 

Having compared the performance of a conventional 
genetic algorithm with that of using SAC for finding low 
energy minima of a simple Ising model, we now turn to 
discuss the second model where the atoms interact via 
a Morse-potential (equation I0.3|) with parameters taken 
from tableland cutoffs, E cut in equation l0.31 chosen such 
as to the include either nearest neighbour interactions or 
up to the fourth nearest neighbours. Again, a periodic 
10 x 10 atom cell is considered where 50 Bi and 50 Cu 
atoms are distributed on a square lattice with lattice con- 
stant 2.56 A. 1000 runs were carried out, and the mean 
value of the best energies at each step are displayed in 
figure 3J Results from calculations with and without the 
use of symmetry adapted crossover is reported, and no 
mutations were used. 

Again, rapid convergence is achieved using SAC, and 
less than 1100 steps were necessarily in all 1000 runs to 
reach the global minima. The conventional genetic al- 
gorithm fails in all attempts, hampered by premature 
convergence. Smith 1J] showed that specialised conven- 
tional GA when applied for finding low energy minima 
on a Morse potential performed worse than a Metropo- 
lis Monte Carlo algorithm. It was argued that the spe- 
cialised crossover, which was successful when applied to 
simple Ising alloys, fails to work properly when there is a 
marked difference in the bond energies. If the atoms were 
allowed to relax in space the specialised mating operation 
would work properly because of the similar Cu-Cu and 
Bi-Bi bond energies. Since, in the present work, we do 
not resort to the use of specialised operators, it is of in- 
terest to compare the performance of SAC with that of 
Metropolis MC 14]. Comparing the MC values in figure 
7 in ref [3 with that of SAC, we find that the MC runs 
appear to converge faster than SAC early in the evolu- 



FIG. 4: Evolution of the best energy based on 1000 GA simu- 
lations of a 10 x 10 atom Morse alloy. See main text for details. 
The full lines are results from that of using SAC whereas the 
dashed line is GA runs carried out using a conventional ge- 
netic algorithm without mutations. The thick full line is the 
result from runs where up to the forth nearest neighbour in- 
teractions were included in the energy, whereas the thin full 
line is result with only nearest neighbour interactions. The 
ground state energy is eV in both cases. 



tion. However SAC clearly outperforms MC in finding 
the global minima rapidly. The high efficiency is due to 
a properly working crossover throughout the evolution 
which is reflected in a steep, roughly linear, decay in the 
mean energy, followed by an exponential decay near the 
groundstates. By contrast, in Metropolis MC changes 
to a single solution is being made which is manifested 
in a different functional form of the mean energy which 
progress at a much slower rate in the latter part of the 
evolution. 

We also carried out test calculations by increasing the 
cutoff distance in the Morse-potential such as to include 
up to the fourth nearest neighbours. Interestingly, as can 
be seen in figure [4J the performance of the genetic algo- 
rithm is not very sensitive to changes in the cutoff of the 
Morse-potential which is encouraging for the application 
of the presented genetic algorithm for modelling substitu- 
tionally disordered compounds with long-range interac- 
tions. In addition, the robustness of the presented genetic 
algorithm in handling different alloy models (i.e. with 
different functional form) without having to redesign the 
algorithm, is particular promising for the study of com- 
plex minerals and alloys as well as related problems (e.g. 
magnetism) . 
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CONCLUSIONS 

We have presented a novel genetic algorithm for finding 
configurations with target properties of substitutionally 
disordered materials (i.e. alloys and minerals) as well 
as surfaces in which premature convergence hampering 
conventional GA is avoided. The presence of a large 
number of symmetrically equivalent low energy config- 
urations, which in general are far away from one-anothcr 
in configuration space, is a challenge to conventional GA 
since there is no selection in the direction of a single 
global minimum. The conventional genetic algorithm has 
a synchronisation problem and the crossover fails to work 
properly when the diversity of the population is lost, leav- 
ing the the problem of finding a global minimum to the 
mutation operator alone. 

We solve the synchronisation problem using a 
symmetry-adapted crossover in combination with repre- 
sentations which reflect the spatial arrangements of the 
atoms, by replacing the offspring with a symmetrically 
equivalent configuration. The use of symmetry-adapted 
mating operations allows one to carry out large swaps in 
configuration space, enabling the building blocks of good 
solution to combine to form higher order building blocks 
of even better solutions. 

Calculations on simple 2D alloy models (i.e. an Ising 
alloy and a Morse alloy) show that the use of symmetry- 
adapted operators outperform a conventional genetic al- 
gorithm, which fails in all attempts to find a global min- 
imum even with huge population sizes (i.e. with very lit- 
tle amount of selection). Although Smith 14j is able to 
find the global minimum of a 100 atom Ising model using 
a specialised genetic algorithm designed to solve simple 
Ising models, the use of this genetic algorithm fails to 
outperform Metropolis Monte Carlo algorithm when ap- 
plied to Morse alloys. By contrast, we have shown that 
genetic algorithms in combination with SAC outperform 
Metropolis Monte Carlo-, and specialised GA methods 
in both cases, and global minima of large systems (e.g. 
alloys with more than 10 000 atoms) were found. 

The robustness of the presented algorithm in handling 
alloy models with different interactions without the need 
for any redesign, is particular promising for future appli- 
cations. 
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